(54) Title: A METHOD FOR THE IDENTIFICATION OF PROTEASE INHIBITORS
(57) Abstract 

Methods for detecting protease inhibitors are disclosed. Also described are DNA constructs and host cells transformed 
with these constructs for use in the subject methods. The methods utilize a host cell which exhibits a negative phenotype dependent
on the activity of a given protease. Thus, inhibition of the protease confers a selectable phenotype on the cell. The negative
phenotype can be conferred by either protease-mediated activation or inactivation of a protein conferring a selectable phenotype. 
The inhibitor is detected by transforming host cells expressing the genes for the selectable phenotype and a given protease with 
random peptide sequences. Inhibitors so identified can be used either directly or indirectly in the treatment of protease-dependent 
disorders. 
A METHOD FOR THE IDENTIFICATION OF PROTEASE INHIBITORS



Technical Field 

The instant invention relates generally to the
identification of protease inhibitors. More
particularly, the invention relates to methods of
identifying viral protease inhibitors which can in turn
be used to treat or prevent viral infection.

Background 

Proteases are enzymes that cleave peptide
bonds, thereby altering proteins. Besides degrading
proteins, these enzymes play a regulatory role in a
variety of physiological processes. Proteases fall into
four general classes: serine, cysteine, aspartic acid,
and metalloproteases. These classes are distinguished
primarily by mechanism (Dunn, B.M., in Proteolytic
Enzymes, R.J. Beynon and J.S. Bond, eds., IRL Press,
Oxford, 1989, pp. 57-82). Serine and cysteine proteases
have almost identical two-step mechanisms with an
acyl-enzyme intermediate. Together they comprise the majority
of the known proteases. Aspartic and metalloproteases
catalyze direct hydrolysis of the peptide bond.

Many procaryotic and eucaryotic proteins are
synthesized as larger, biologically inactive precursors
which become activated only when acted upon by
endoproteases (proteinases). These enzymes typically
recognize specific domains, usually less than ten amino
acids in length including the scissile bond, in exposed
loops of generally loose secondary structure (Keil, B.,
in Methods in Protein Sequence Analysis, M. Elzinga, ed.,
Humana Press, Clifton, N.J., 1982, pp. 291-304).

Maturation proteases are responsible for both
intracellular and extracellular cleavage of protein
precursors, and many of the proteolytically processed
proteins in turn play key roles in physiological
abnormalities which give rise to disease states (Andrews,
P.C., et al., Experientia (1987) 43:784-789; Reich, E.,
et al., Cold Spring Harbor Symposium: Proteases and
Biological Control, Cold Spring Harbor Laboratory, Cold
Spring Harbor, 1975). Proteins that undergo
intracellular proteolytic maturation include secreted
proteins, lysosomal enzymes, mitochondrial proteins and
membrane proteins. These proteins are highly diverse in
function, having endocrine, neurological, and immune
functions, as well as acting as growth factors and
antibiotics. Secreted proteins that undergo
extracellular proteolytic processing include the plasma
zymogens involved in blood clotting and the immune
complement system.

Maturation proteases which are indirectly
involved in human disease are generally distinguished by
their high degree of substrate specificity. However, a
host of digestive proteases of lesser specificity are
also known and are more directly involved in diseases
such as chronic inflammation and tumor metastasis. These
enzymes include elastases, collagenases, mast cell
proteases, and extracellular matrix-degrading
metalloproteases, among others.

Proteases also play key roles in many
infectious diseases. An obligatory step in the
replication of many pathogenic plant and animal viruses
involves virus-determined proteolytic processing of the
primary viral gene products (Hellen, C.U.T., et al.,
Biochemistry (1989) 28:9881-9890). Plant viruses which
encode proteases for this purpose include the
potyviruses, comoviruses, nepoviruses, sobemoviruses, and
luteoviruses. These viruses cause economically important
diseases in all major monocot and dicot families.
Similarly equipped animal viruses include the
picornaviruses, retroviruses, alphaviruses, flaviviruses,
pestiviruses, coronaviruses, and adenoviruses. Diseases
caused by these viruses include foot-and-mouth disease,
AIDS, the common cold, hepatitis and polio.

For example, Zucchini Yellow Mosaic Virus
(ZYMV), a potyvirus, expresses its genome as a single 350
kDa polyprotein which is cleaved into at least seven
mature gene products by three distinct proteolytic
activities. Two of the proteases are virus-encoded
(Dougherty, W.G., and J.C. Carrington, Annu. Rev.
Phytopathol. (1988) 26:123-143; Carrington, J.C., et al.,
EMBO J. (1990) 9:1347-1353), including the potyviral 49
kDa protease. This protease is responsible for at least
five of the seven cleavages. This enzyme is a trypsin-like
cysteine protease which is structurally and
mechanistically representative of the largest class of
viral proteases, including those of the animal
picornaviruses (Dougherty, W.G., et al., Virology (1989)
172:302-310; Bazan, J.F., and R.J. Fletterick, Proc.
Natl. Acad. Sci. USA (1988) 85:7872-7876). This enzyme
is highly specific and appears to recognize a region
comprising about seven amino acids surrounding the
scissile bond (Dougherty, W.G., and T.D. Parks, Virology
(1989) 172:145-155). Of the five sites cleaved by this
enzyme, the two flanking the protease appear to be
cleaved intramolecularly, while the remaining three
appear to be cleaved intermolecularly (Garcia, J.A., et
al., J. Gen. Virol. (1990) 71:2773-2779). Of the latter
three, the site between the NIb protein and the coat
protein appears to be the most active.

Disclosure of the Invention
The invention herein is based on the discovery
of a unique method for detecting peptide protease
inhibitors. These inhibitors can be used directly or
indirectly in the treatment of protease-dependent
diseases. Alternatively, the inhibitors so identified
can be utilized as structural models for the rational
design of peptide-mimetics.

Accordingly, in one embodiment, the subject
invention is directed to a method for detecting a
protease inhibitor which comprises:

(a) providing a population of host cells
expressing a first nucleic acid sequence encoding a
protease and a second nucleic acid sequence encoding a
protein capable of conferring a selectable phenotype on
said host cells dependent on the activity of said
protease;

(b) providing a pool of nucleic acid
constructs wherein at least one of the constructs in the
pool comprises a nucleic acid sequence encoding an
inhibitor of the protease;

(c) transforming the host cells of (a) with
the nucleic acid constructs of (b); and

(d) growing the transformed host cells of (c)
under conditions that distinguish cells with the
selectable phenotype, thereby detecting the presence of
the protease inhibitor.

In another embodiment, the subject invention is
directed to a DNA construct comprising:

(a) a first DNA coding sequence for a protein
capable of conferring a selectable phenotype on a host
cell transformed therewith, the selectable phenotype
dependent on the activity of a protease; and

(b) control sequences that are operably linked
to the first and second coding sequences whereby the
coding sequences can be transcribed and translated in a
host cell, and at least one of the control sequences is
heterologous to at least one of the coding sequences.

In an alternate embodiment, the DNA construct
further includes a second DNA coding sequence for the
protease of interest.

In yet another embodiment, the subject 
invention is directed to host cells stably transformed 
with these DNA constructs. 

These and other embodiments of the subject
invention will readily occur to those of ordinary skill
in the art in view of the disclosure herein.

Brief Description of the Figures 

Figure 1 depicts Protease Inhibitor Selection
System I, as applied to ZYMV protease.

Figure 2 depicts representative examples of
Protease Inhibitor Selection System II.

Figure 3 shows the strategy of cDNA synthesis
from ZYMV and cloning methods.

Figure 4 shows the nucleotide sequence of the
ZYMV genome (SEQ ID NO:1).

Figure 5A shows the organization of the primary
translation products of pZPro6, pZPro7 and placZa-CP.
Figure 5B depicts the results of immunoblot analysis of
SDS/PAGE-separated proteins from E. coli cells harboring
these plasmids.

Figure 6 depicts the derivation of the pZPro7,
pZPro9, pZPro10, pZPro11 and pZPro12 constructs and the
organization of the primary translation products. The
open boxes denote ZYMV 49 kDa protease (Pro) cleavage
sites. StrepR = streptomycin-resistant. AmpR =
ampicillin-resistant, i.e., transformed.
Cfu = colony-forming units. NT = not tested.

Detailed Description of the Invention

The practice of the present invention will
employ, unless otherwise indicated, conventional
techniques of molecular biology, microbiology, virology,
recombinant DNA technology, and immunology, which are
within the skill of the art. Such techniques are
explained fully in the literature. See, e.g., Sambrook,
Fritsch & Maniatis, Molecular Cloning: A Laboratory
Manual, Second Edition (1989); Maniatis, Fritsch &
Sambrook, Molecular Cloning: A Laboratory Manual (1982);
DNA Cloning, Vols. I and II (D.N. Glover ed. 1985);
Oligonucleotide Synthesis (M.J. Gait ed. 1984); Nucleic
Acid Hybridization (B.D. Hames & S.J. Higgins eds. 1984);
Animal Cell Culture (R.K. Freshney ed. 1986); Immobilized
Cells and Enzymes (IRL Press, 1986); B. Perbal, A
Practical Guide to Molecular Cloning (1984); the series,
Methods in Enzymology (S. Colowick and N. Kaplan eds.,
Academic Press, Inc.); and Handbook of Experimental
Immunology, Vols. I-IV (D.M. Weir and C.C. Blackwell
eds., 1986, Blackwell Scientific Publications).

All patents, patent applications, and
publications mentioned herein, whether supra or infra,
are hereby incorporated by reference in their entirety.



A. Definitions

In describing the present invention, the 
following terms will be employed, and are intended to be 
defined as indicated below. 

By "protease" is meant an enzyme that cleaves a
peptide bond. The term includes both endopeptidases
(also called proteinases), which are proteases that
hydrolyze internal peptide bonds, and exopeptidases,
which are proteases that cleave either N- or C-terminal
peptide bonds. Some proteases are highly specific,
cleaving only between two particular amino acids within a
particular protein. Other proteases are less specific,
cleaving between more than one amino acid pair and/or
cleaving between an amino acid pair in more than one
location in the same and/or different proteins.
Exemplary proteases include maturation proteases
responsible for both intracellular and extracellular
cleavage of protein precursors, such as secreted
proteins, lysosomal enzymes, mitochondrial proteins,
membrane proteins, plasma zymogens, digestive enzymes,
elastases, collagenases, mast cell proteases,
extracellular matrix-degrading metalloproteinases; plant
viral proteases such as proteases from potyviruses,
comoviruses, nepoviruses, sobemoviruses, and
luteoviruses; and animal viral proteases such as
proteases from picornaviruses, retroviruses,
alphaviruses, flaviviruses, pestiviruses, coronaviruses,
and adenoviruses.

By "protease inhibitor" is meant a molecule
capable of altering the activity of a protease such that
the protease is unable to completely hydrolyze a peptide
bond for which it is specific. Protease inhibitors can
be peptides composed solely of genetically encodable
amino acids. "Protease inhibitor" also encompasses
synthetic peptide derivatives such as peptide aldehydes
and ketones, peptide boronic acids, peptide chloromethyl
ketones, azapeptides, peptide hydroxamic acids, and
peptide thiols. "Protease inhibitor" also encompasses
synthetic nonpeptide compounds such as
diisopropylphosphofluoridate, sulfonylfluorides,
phosphoramidon, and halomethylcoumarins. For a detailed
discussion of protease inhibitors, see Proteinase
Inhibitors, A.J. Barrett and G. Salvesen, eds. (Elsevier,
Amsterdam, 1986).

The terms "peptide" and "protein" are used in
their broadest sense, i.e., any polymer of genetically
encodable amino acids (dipeptide or greater) linked
through peptide bonds. Thus, the terms include
oligopeptides, polypeptides, protein fragments, muteins,
fusion proteins and the like.

A "host cell" is a cell which has been
transformed, or is capable of transformation, by an
exogenous nucleotide sequence. As described more fully
below, host cells for use in the present invention may be
either procaryotic or eucaryotic, depending on the specific
protease in question and the selection system desired.
In general, bacterial cells (either gram-negative or
gram-positive) are the hosts of choice when the protease
and its dependent phenotype can be expressed in active
form in these cells. Eucaryotic cells can be used,
however, in cases where either the protease or its
dependent phenotype can be adequately expressed only in
such cells, such as cases in which certain types of
transport, metabolism, or post-translational modification
are required. Eucaryotic cells can also be used to
select inhibitors of other types of biological activities
which can be expressed only in such cells, such as animal
virus replication. One skilled in the art can readily
determine an appropriate host cell for use in the present
invention.

A "replicon" is any genetic element (e.g.,
plasmid, chromosome, virus) that functions as an
autonomous unit of DNA replication; i.e., capable of
replication under its own control.

A "vector" is a replicon, such as a plasmid,
phage, or cosmid, to which another DNA segment may be
attached so as to bring about the replication of the
attached segment.

A "double-stranded DNA molecule" refers to the
polymeric form of deoxyribonucleotides (bases adenine,
guanine, thymine, or cytosine) in a double-stranded
helix, both relaxed and supercoiled. This term refers
only to the primary and secondary structure of the
molecule, and does not limit it to any particular
tertiary forms. Thus, this term includes double-stranded
DNA found, inter alia, in linear DNA molecules (e.g.,
restriction fragments), viruses, plasmids, and
chromosomes. In discussing the structure of particular
double-stranded DNA molecules, sequences may be described
herein according to the normal convention of giving only
the sequence in the 5' to 3' direction along the
nontranscribed strand of DNA (i.e., the strand having the
sequence homologous to the mRNA).

A DNA "coding sequence" or a "nucleotide
sequence encoding" a particular protein, is a DNA
sequence which is transcribed and translated into a
polypeptide in vivo when placed under the control of
appropriate regulatory sequences. The boundaries of the
coding sequence are determined by a start codon at the 5'
(amino) terminus and a translation stop codon at the 3'
(carboxy) terminus. A coding sequence can include, but
is not limited to, procaryotic sequences, cDNA from
eucaryotic mRNA, genomic DNA sequences from eucaryotic
(e.g., mammalian) DNA, and even synthetic DNA sequences.
A transcription termination sequence will usually be
located 3' to the coding sequence.

A "promoter sequence" is a DNA regulatory
region capable of binding RNA polymerase in a cell and
initiating transcription of a downstream (3' direction)
coding sequence. For purposes of defining the present
invention, the promoter sequence is bounded at the 3'
terminus by the translation start codon (ATG) of a coding
sequence and extends upstream (5' direction) to include
the minimum number of bases or elements necessary to
initiate transcription at levels detectable above
background. Within the promoter sequence will be found a
transcription initiation site (conveniently defined by
mapping with nuclease S1), as well as protein binding
domains (consensus sequences) responsible for the binding
of RNA polymerase. Eucaryotic promoters will often, but
not always, contain "TATA" boxes and "CAT" boxes.
Procaryotic promoters contain Shine-Dalgarno sequences in
addition to the -10 and -35 consensus sequences.

DNA "control sequences" refers collectively to
promoter sequences, ribosome binding sites,
polyadenylation signals, transcription termination
sequences, upstream regulatory domains, enhancers, and
the like, which collectively provide for the
transcription and translation of a coding sequence in a
host cell.

A coding sequence is "operably linked to"
another coding sequence when RNA polymerase will
transcribe the two coding sequences into mRNA, which is
then translated into a chimeric polypeptide encoded by
the two coding sequences. The coding sequences need not
be contiguous to one another so long as the transcribed
sequence is ultimately processed to produce the desired
chimeric protein.

A control sequence "directs the transcription"
of a coding sequence in a cell when RNA polymerase will
bind the promoter sequence and transcribe the coding
sequence into mRNA, which is then translated into the
polypeptide encoded by the coding sequence.

A cell has been "transformed" by an exogenous
nucleotide sequence when the sequence has been introduced
inside the cell membrane. An exogenous nucleotide
sequence may or may not be integrated (covalently linked)
to chromosomal nucleic acid making up the genome of the
cell. In procaryotes and yeasts, for example, the
exogenous nucleotide sequence may be maintained on an
episomal element, such as a plasmid. With respect to
most other eucaryotic cells, a stably transformed cell is
one in which the exogenous nucleotide sequence has become
integrated into the chromosome so that it is inherited by
daughter cells through chromosome replication. This
stability is demonstrated by the ability of the
eucaryotic cell to establish cell lines or clones
comprised of a population of daughter cells containing the
exogenous sequence.

A "clone" is a population of cells derived from
a single cell or common ancestor by mitosis. A "cell
line" is a clone of a primary cell that is capable of
stable growth in vitro for many generations.

A "heterologous" region of a DNA construct is
an identifiable segment of DNA within or attached to
another DNA molecule that is not found in association
with the other molecule in nature. Thus, when the
heterologous region encodes a bacterial gene, the gene
will usually be flanked by DNA that does not flank the
bacterial gene in the genome of the source bacteria.
Another example of the heterologous coding sequence is a
construct where the coding sequence itself is not found
in nature (e.g., synthetic sequences having codons
different from the native gene). Allelic variation or
naturally occurring mutational events do not give rise to
a heterologous region of DNA, as used herein.

The term "treatment" as used herein refers to
either (i) the prevention of infection or reinfection
(prophylaxis), or (ii) the reduction or elimination of
symptoms of the disease of interest (therapy).

B. General Methods

Described herein is a system which can be used
to select effective inhibitors of proteases from large
pools of random peptide sequences. The method utilizes a
known protease which can be obtained through standard
techniques, i.e., direct isolation, synthesis or
recombinant technology. The nucleotide sequence encoding the
protease can be determined and used to transform a host
cell. The host cell is also transformed with a
nucleotide sequence encoding a protein that confers a
negative phenotype on the cell, such as sensitivity to a
given antibiotic, or inability to grow on a given carbon
source, which is dependent on the activity of the cloned
protease. Genes for the protease and the protein
conferring the dependent phenotype are contained on one
or more constructs which have been introduced into the
host cell. Thus, inhibition of the protease confers a
selectable phenotype on the cell (e.g., antibiotic
resistance, or the ability to grow on a given carbon
source). Once identified, the particular inhibitor can
be isolated, sequenced and further used as described
below.

The negative phenotype may be expressed by
either of two general mechanisms. In the first, a gene
conferring a dominant negative phenotype is expressed as
an inactive precursor protein which is activated by
protease-mediated cleavage at a site which has been
engineered to resemble a natural substrate of the
protease. In the second mechanism, a gene conferring a
selectable phenotype is inactivated by protease-mediated
cleavage at a similarly engineered site.
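
The logic of the two mechanisms can be summarized in a short sketch. The following Python fragment is illustrative only and is not part of the disclosed method; the names Mechanism and survives_selection are hypothetical, and the model deliberately reduces protease activity and inhibition to simple true/false values.

```python
from enum import Enum


class Mechanism(Enum):
    # Cleavage ACTIVATES a protein that confers the negative phenotype.
    ACTIVATION = "activation"
    # Cleavage INACTIVATES a protein that confers the selectable phenotype.
    INACTIVATION = "inactivation"


def survives_selection(mechanism: Mechanism, protease_inhibited: bool) -> bool:
    """Return True if the host cell displays the selectable phenotype."""
    cleaved = not protease_inhibited
    if mechanism is Mechanism.ACTIVATION:
        # The negative-phenotype protein is active only after cleavage.
        negative_protein_active = cleaved
        return not negative_protein_active
    # The selectable-phenotype protein is active only while uncleaved.
    selectable_protein_active = not cleaved
    return selectable_protein_active


if __name__ == "__main__":
    for mechanism in Mechanism:
        for inhibited in (False, True):
            print(mechanism.value, inhibited, survives_selection(mechanism, inhibited))
```

Under either mechanism the simplified model gives the same outcome: the cell displays the selectable phenotype only when the protease is inhibited.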

The above-described host cells can then be used
to detect effective inhibitors of the protease from large
pools of random peptides encoded on another plasmid.
Cells transformed with variants of this plasmid which
encode effective inhibitors are identified by selection
for the appropriate phenotype. This additional plasmid
contains a gene which encodes a "carrier" protein in
which all or part of an exposed domain has been
randomized with respect to its amino acid sequence.
Typically, the randomized domain may range from four to
fifteen amino acids in length. The length of the
randomized amino acid sequence will depend on the
specific application of the inhibitor and can be readily
determined by one skilled in the art. For example,
peptides intended for use for the design of peptide
mimetics will tend to have shorter sequences than
peptides for use in peptide or gene therapy.

Part or all of this sequence is randomized with
respect to the twenty genetically-encodable amino acids.
Thus, a fully randomized set of heptapeptide sequences
would contain more than 10^9 different peptides.
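
The size of a fully randomized peptide set follows directly from the number of encodable amino acids raised to the peptide length. The following Python sketch illustrates the arithmetic; the function name peptide_library_size is hypothetical and the calculation counts distinct peptides only, not the DNA sequences encoding them.

```python
AMINO_ACIDS = 20  # genetically encodable amino acids


def peptide_library_size(length: int) -> int:
    """Number of distinct peptides of the given length when every
    position is fully randomized over the twenty amino acids."""
    if length < 1:
        raise ValueError("length must be at least 1")
    return AMINO_ACIDS ** length


if __name__ == "__main__":
    # Four to fifteen residues is the typical range given in the text.
    for length in (4, 5, 7, 15):
        print(length, peptide_library_size(length))
```

For a heptapeptide this gives 20^7 = 1,280,000,000, consistent with the figure of more than 10^9 stated above.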

Such random sequence "libraries" can be
constructed by replacing the sequence encoding the
exposed domain in the "carrier" protein gene with a set
of synthetic oligodeoxynucleotides of random sequence. A
natural substrate of the protease in question can be
conveniently used for the "carrier" protein.
Alternatively, one of the many well-characterized natural
protease inhibitors may be used (Proteinase Inhibitors,
A.J. Barrett and G. Salvesen, eds. (Elsevier, Amsterdam,
1986), section C), in which the amino acid sequence of
the native binding domain has been randomized.
Structural constraints placed on the randomized sequence
by the flanking domains of the "carrier" may be minimized
by flanking the randomized sequence with short "spacers"
of polyproline or polyglycine which are highly flexible
(Creighton, T.E., Proteins: Structures and Molecular
Properties, W.H. Freeman, New York, 1984, ch. 5).

Some of these "random" peptides will, by
chance, have structures which are capable of binding
tightly to the active site of the protease, thereby
preventing it from either activating or inactivating the
negative phenotype-conferring protein, depending on the
mechanism employed. This, in turn, will confer the
selectable phenotype on the host cell when transformed
with these constructs. The structures of effective
inhibitors can then be determined by sequencing the
appropriate regions of constructs recovered from such
phenotype-selected cells.

A representative example of the first mechanism
described above, i.e., wherein the activation of a
negative phenotype-conferring protein is inhibited
(hereinafter referred to as Protease Inhibitor Selection
System I) as applied to the ZYMV protease is illustrated
in Figure 1. A portion of the ZYMV polyprotein is
depicted which includes the protease, replicase (NIb),
and coat protein (see U.S. Patent Application Serial No.
07/560,130). The arrows indicate substrate sites at
which the protease cleaves the polyprotein. In this
selection system, either the replicase or coat protein is
replaced with the coding sequence for the protein
conferring the negative phenotype. An example of the
latter includes E. coli ribosomal protein S12, which
confers sensitivity to streptomycin on
streptomycin-resistant hosts such as E. coli strain MC1009 (Post,
L.E., and M. Nomura, J. Biol. Chem. (1980) 255:4660-
4666). A transcription repressor protein may also be
used, such as the lactose, tryptophan, or phage lambda
repressors (Lewin, B., Genes IV, Cell Press, Cambridge,
1990, pp. 240-264). These act by repressing expression
of antibiotic resistance genes in hosts in which these
genes are transcribed from repressible promoters. In
either case the negative phenotype is only displayed when
the negative phenotype protein is being actively cleaved
out of the polyprotein by the protease.

For other proteases it may be convenient to
express the protease and the negative phenotype precursor
from separate transcription units. The only requirements
are that the negative phenotype protein be linked to an
extraneous domain by a peptide sequence which is a
natural substrate for the protease in question, and that
this precursor be inactive until cleaved by the protease.

Figure 2 illustrates the second mechanism
wherein the negative phenotype is conferred by
protease-mediated deactivation of a protein conferring a
selectable phenotype (referred to herein as Protease
Inhibitor Selection System II). Examples of such
proteins include secreted or membrane proteins which
confer resistance to the antibiotics ampicillin,
tetracycline, or kanamycin (Methods in Enzymology,
vol. 43, Academic Press, New York, 1975), or which confer
the ability to utilize carbon sources such as lactose or
maltose (Bieker, K.L., and T.J. Silhavy, Trends in
Genetics (1990) 6:329-334). These proteins are normally
expressed as precursors in which an amino-terminal signal
sequence directs transport of the protein across the cell
membrane or insertion of the protein into the cell
membrane, after which the signal sequence is
proteolytically removed. In these constructs, the
protease substrate peptide sequence is inserted between
the signal sequence and the mature protein such that
cleavage by the protease renders the protein incapable of
membrane transport or insertion and thereby inactive.
Alternatively, the protease substrate sequence may be
inserted into a surface domain of the mature protein such
that cleavage by the protease renders the protein
inactive.

A special case of this selection system occurs
with proteases which are toxic when expressed in E. coli
by virtue of their fortuitous inactivation of one or more
host proteins which are required for growth (Baum, E.Z.,
et al., Proc. Natl. Acad. Sci. USA (1990) 87:5573-5577).
In such cases, the inactivated host protein(s) confer the
selectable phenotype in the presence of inhibitors of the
protease.

The random peptide inhibitor gene library may
be delivered to the selector cells by any of several
methods, the choice of which will depend to some extent
on the size of the library. One skilled in the art can
readily determine an acceptable technique to use with a
given library. For example, chemical transformation with
purified plasmid (Sambrook, J., et al., Molecular
Cloning, Cold Spring Harbor Laboratory, 1989, pp. 1.76-
1.84) can be used for libraries of up to 10^8-10^9 members,
depending on the efficiency. Such a library can
accommodate a complete set of fully randomized
pentapeptides. High voltage electroporation with
purified plasmid (Dower, W.J., et al., Nucleic Acids Res.
(1988) 16:6127-6145) is useful for libraries of 10^10-10^11
members, nearly sufficient to accommodate a complete set
of fully randomized heptapeptides. For larger libraries,
bacteriophage-derived vectors can be used for delivery by
transduction.
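
The relationship between library diversity and delivery method can be explored with a short calculation. The Python sketch below is illustrative only: the capacities are the upper ends of the ranges quoted above, the names CAPACITY, diversity and methods_that_fit are hypothetical, and the number of variants per randomized position is an assumed parameter, since the text does not state whether library size is counted as distinct peptides (20 per position) or as distinct encoding DNA sequences (for example, 64 per position if every codon is allowed).

```python
# Upper ends of the library sizes quoted in the text for each delivery method.
CAPACITY = {
    "chemical transformation": 10**9,
    "electroporation": 10**11,
}


def diversity(length: int, variants_per_position: int = 20) -> int:
    """Number of distinct sequences with `length` randomized positions."""
    return variants_per_position ** length


def methods_that_fit(length: int, variants_per_position: int = 20) -> list[str]:
    """Delivery methods whose quoted capacity is at least the diversity."""
    needed = diversity(length, variants_per_position)
    return [name for name, cap in CAPACITY.items() if cap >= needed]


if __name__ == "__main__":
    for length in (5, 7):
        for variants in (20, 64):
            print(length, variants, diversity(length, variants),
                  methods_that_fit(length, variants))
```

The comparison ignores the oversampling needed in practice to make it likely that every member of a library is actually represented among the transformants, so the true requirements are somewhat higher than the raw diversity.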

For example, a plasmid vector can be converted
to a cosmid vector (Sambrook, J., et al., Molecular
Cloning, Cold Spring Harbor Laboratory, 1989, ch. 3)
simply by insertion of a cos site and an appropriate
length of "stuffer" DNA. Concatemeric ligation of the
library to such a vector can be followed by efficient
packaging into phage λ pseudovirions using commercially
available preparations. Efficient, large-scale
transductions of the packaged cosmids into selector cells
can then be accomplished by established methods. In a
further refinement, concatemers of the inhibitor gene
library can be used instead of "stuffer" DNA in the
cosmid to achieve the necessary size for packaging. This
reduces, by an order of magnitude, the number of
transformants that need to be screened to cover the
entire library.

The stringency of selection by these systems
can be adjusted in a variety of ways. A number of
transcriptional promoters and enhancers of varying
strengths are available (Sambrook, J., et al., Molecular
Cloning, Cold Spring Harbor Laboratory, 1989, ch. 17),
which can be used with the protease, negative phenotype
precursor, and inhibitor genes to raise or lower the
inhibitor strength required for selection. Inducible
promoters can be used, such that their strengths may be
titrated by adjusting the amount of inducer in the growth
medium. For example, by having the inhibitor expressed
from an inducible promoter, the potency threshold of a
pool of selected inhibitors is raised, and the size of
the pool reduced, simply by reducing the amount of
inducer present during selection. In addition, such
inhibitor inducibility can be used to counterselect
stable false positives, such as revertants that have
mutated the protease gene, simply by replica plating onto
selective medium in the absence of inducer. Only the
revertants are able to grow.

Once detected, the protease inhibitor can be
isolated and chemically characterized, using known
techniques. These systems can be used to generate
inhibitor peptides for any protease which can be
expressed in active form in a suitable host and for which
substrate cleavage site sequences are known. In addition
to the proteases of many important plant and animal viral
pathogens, inhibitors of the proteases of other types of
microbial pathogens as well as cellular proteases which
have been implicated in such disorders as rheumatoid
arthritis, Alzheimer's disease, and tumor metastasis, can
also be identified.

C. Use and Administration

The instant invention can be used to identify
protease inhibitors which in turn are useful in the
treatment of protease-dependent diseases in both plants
and animals. The inhibitors can be used directly in
peptide therapy or can be encoded in a gene and used in
gene therapy. The identified inhibitors can also serve
as structural models for the rational design of
peptide-mimetics, that is, synthetic compounds that mimic
the protease-binding action of the identified protease
inhibitors to bring reactive groups into contact with the
protease active site. (See, e.g., Demuth, H.-U., J.
Enzyme Inhibition (1990) 3:249-278.) The present
invention also has a more general application in the
construction and use of in vivo systems for the selection
of bioactive peptides from peptide libraries. For
example, systems may be designed for the selection of
peptide inhibitors of hydrolytic enzymes other than
proteases. A number of such enzymes are known which can
hydrolyze natural or artificial substrates to produce one
or more compounds which are toxic to E. coli (for
example, see Hydrolytic Enzymes, A. Neuberger and K.
Brocklehurst, eds., Elsevier, Amsterdam, 1987). The
expression of such an enzyme in an appropriate host cell
allows the selection of peptide inhibitors of the enzyme
based on their ability to confer viability on the cells
in the presence of toxigenic substrates.

In general, any phenotype of cultured
procaryotic or eucaryotic cells which can be altered by
the endogenous expression of appropriate peptides such
that cells expressing such peptides can be readily
distinguished and isolated from cells which either do not
express such peptides or which express peptides which do
not alter the phenotype, may provide the basis for
establishing an in vivo system for the selection of
bioactive peptides from peptide libraries. Among the
most tractable medically important phenotypes will be
those manifesting susceptibility to microbial
pathogenicity.

For example, the endogenous expression of a
random peptide library as an exposed domain of a suitable
stable "carrier" protein in a population of cultured
mammalian cells of sufficient size to ensure that all or
most members of the library are represented in the
population may be used to select peptides which interfere
with the ability of microbial pathogens such as viruses
or bacteria or their toxins to inhibit cell growth. When
such cell populations are challenged by such pathogens or
toxins, only those cells expressing inhibitory peptides
will grow, allowing the active peptides to be identified
by established methods. Such peptides can in turn be
used for the development of effective therapies.

For the treatment of plant pathogenesis, the
identified inhibitors can be used to create transgenic
plants. One commonly used method of gene transfer in
plants involves the insertion of the gene of interest
into the T-DNA region of a Ti or Ri plasmid derived from
A. tumefaciens or A. rhizogenes, respectively. Many
control sequences are known which when coupled to a
heterologous coding sequence and transformed into a host
organism show fidelity in gene expression with respect to
tissue/organ specificity of the original coding sequence.
See, e.g., Benfey, P.N., and Chua, N.H., Science (1989)
244:174-181. Suitable control sequences for use in these
plasmids include promoters for constitutive leaf-specific
expression of the desired gene in the various target
plants. Other useful control sequences include a
promoter and terminator from the nopaline synthase gene
(NOS). The NOS promoter and terminator are present in
the plasmid pARC2, available from the American Type
Culture Collection and designated ATCC 67238. If such a
system is used, the virulence (vir) gene from either the
Ti or Ri plasmid must also be present, either along with
the T-DNA portion, or via a binary system where the vir
gene is present on a separate vector. Such systems,
vectors for use therein, and methods of transforming
plant cells are described in U.S. Patent No. 4,658,082,
and Simpson, R.B., et al., Plant Mol. Biol. (1986)
6:403-415, incorporated herein by reference in their
entirety.

Once constructed, these plasmids can be placed
into A. rhizogenes or A. tumefaciens and these vectors
used to transform cells of plant species which are
ordinarily susceptible to the particular plant pathogen.
The selection of either A. tumefaciens or A. rhizogenes
will depend on the plant being transformed thereby. In
general, A. tumefaciens is the preferred organism for
transformation. Most dicotyledons, some gymnosperms, and
a few monocotyledons (e.g., certain members of the
Liliales and Arales) are susceptible to infection with A.
tumefaciens. A. rhizogenes also has a wide host range,
embracing most dicots and some gymnosperms, which
includes members of the Leguminosae, Compositae and
Chenopodiaceae. Alternative techniques which have proven
to be effective in genetically transforming plants
include particle bombardment and electroporation. See,
e.g., Rhodes, C.A., et al., Science (1988) 240:204-207;
Shigekawa, K., and Dower, W.J., BioTechniques (1988)
6:742-751; Sanford, J.C., et al., Particulate Science and
Technology (1987) 5:27-37; and McCabe, D.E.,
BioTechnology (1988) 6:923-926.

Once transformed, these cells can be used to
regenerate transgenic plants. For example, whole plants
can be infected with these vectors by wounding the plant
and then introducing the vector into the wound site. Any
part of the plant can be wounded, including leaves, stems
and roots. Alternatively, plant tissue, in the form of
an explant, such as cotyledonary tissue or leaf disks,
can be inoculated with these vectors and cultured under
conditions which promote plant regeneration. Roots or
shoots transformed by inoculation of plant tissue with A.
rhizogenes or A. tumefaciens, containing the desired
gene, can be used as a source of plant tissue to
regenerate transgenic plants, either via somatic
embryogenesis or organogenesis. Examples of such methods
for regenerating plant tissue are disclosed in Shahin,
E.A., Theor. Appl. Genet. (1985) 69:235-240; U.S. Patent
No. 4,658,082; and Simpson et al., supra.

The inhibitors identified by the present method
can also be used in gene therapy. For example,
HIV-specific protease inhibitor genes, in which a natural
mammalian protease inhibitor serves as carrier for the
HIV protease inhibitor domain, can be used in anti-AIDS
gene therapy. Lymphocytes or bone marrow cells from the
patient can be transformed with the protease inhibitor
gene in vitro and returned to the patient, where they
establish an HIV-resistant subpopulation of lymphocytes
which can gradually restore cell-mediated immune function
as the patient's untransformed lymphocytes are depleted
by the virus.

Similarly, proteases active in blood, lymph, or
cerebrospinal fluid which are essential components of
disorders such as chronic inflammations, metastatic
cancers, and certain viral infections, may be targeted by
protease inhibitor gene therapy, in which the inhibitors
are secreted by transgenic lymphocytes or other
transgenic cell implants.

For therapeutic use in animals, the inhibitors
identified by the present method can be altered by
established methods to improve their pharmacokinetic
properties. For example, the inhibitors may be
administered linked to a carrier; a fragment may, for
instance, be conjugated with a macromolecular carrier.
Suitable carriers are typically large, slowly metabolized
macromolecules such as: proteins; polysaccharides, such
as sepharose, agarose, cellulose, cellulose beads and the
like; polymeric amino acids such as polyglutamic acid,
polylysine, and the like; amino acid copolymers; and
inactive virus particles. Especially useful protein
substrates are serum albumins, keyhole limpet hemocyanin,
immunoglobulin molecules, thyroglobulin, ovalbumin, and
other proteins well known to those skilled in the art.

The protein substrates may be used in their native
form or their functional group content may be
modified by, for example, succinylation of lysine
residues or reaction with Cys-thiolactone. A sulfhydryl
group may also be incorporated into the carrier (or
inhibitor) by, for example, reaction of amino functions
with 2-iminothiolane or the N-hydroxysuccinimide ester of
3-(4-dithiopyridyl) propionate. Suitable carriers may
also be modified to incorporate spacer arms (such as
hexamethylene diamine or other bifunctional molecules of
similar size) for attachment of peptides. Methods of
coupling peptides to proteins or cells are known to those
of skill in the art.

It is also possible to administer the
inhibitors identified using the instant method alone, or
mixed with a pharmaceutically acceptable vehicle or
excipient. Typically, the compositions are prepared as
injectables, either as liquid solutions or suspensions;
solid forms suitable for solution in, or suspension in,
liquid vehicles prior to injection may also be prepared.
The preparation may also be emulsified or the active
ingredient encapsulated in liposome vehicles. The active
immunogenic ingredient is often mixed with vehicles
containing excipients which are pharmaceutically
acceptable and compatible with the active ingredient.
Suitable vehicles are, for example, water, saline,
dextrose, glycerol, ethanol, or the like, and
combinations thereof. In addition, if desired, the
vehicle may contain minor amounts of auxiliary substances
such as wetting or emulsifying agents, or pH buffering
agents. Actual methods of preparing such dosage forms
are known, or will be apparent, to those skilled in the
art. See, e.g., Remington's Pharmaceutical Sciences,
Mack Publishing Company, Easton, Pennsylvania, 15th
edition, 1975. The composition or formulation to be
administered will, in any event, contain a quantity of
the inhibitor adequate to achieve the desired effect in
the individual being treated.

Additional formulations which are suitable for
other modes of administration include suppositories and,
in some cases, aerosol, intranasal, oral formulations,
and sustained release formulations. For suppositories,
the vehicle composition will include traditional binders
and carriers, such as polyalkylene glycols or
triglycerides. Such suppositories may be formed from
mixtures containing the active ingredient in the range of
about 0.5% to about 10% (w/w), preferably about 1% to
about 2%. Oral vehicles include such normally employed
excipients as, for example, pharmaceutical grades of
mannitol, lactose, starch, magnesium stearate, sodium
saccharin, cellulose, magnesium carbonate, and the like.
These oral compositions may be taken in the form of
solutions, suspensions, tablets, pills, capsules,
sustained release formulations, or powders, and contain
from about 10% to about 95% of the active ingredient,
preferably about 25% to about 70%.

Intranasal formulations will usually include
vehicles that neither cause irritation to the nasal
mucosa nor significantly disturb ciliary function.
Diluents such as water, aqueous saline or other known
substances can be employed with the subject invention.
The nasal formulations may also contain preservatives
such as, but not limited to, chlorobutanol and
benzalkonium chloride. A surfactant may be present to
enhance absorption of the subject proteins by the nasal
mucosa.

Controlled or sustained release formulations
are made by incorporating the inhibitor into carriers or
vehicles such as liposomes, nonresorbable impermeable
polymers such as ethylenevinyl acetate copolymers and
Hytrel® copolymers, swellable polymers such as hydrogels,
or resorbable polymers such as collagen and certain
polyacids or polyesters such as those used to make
resorbable sutures. The inhibitors can also be delivered
using implanted mini-pumps, well known in the art.

Furthermore, the inhibitors (or complexes
thereof) may be formulated into pharmaceutical
compositions in either neutral or salt forms.
Pharmaceutically acceptable salts include the acid
addition salts (formed with the free amino groups of the
active polypeptides) which are formed with inorganic
acids such as, for example, hydrochloric or phosphoric
acids, or such organic acids as acetic, oxalic, tartaric,
mandelic, and the like. Salts formed from free carboxyl
groups may also be derived from inorganic bases such as,
for example, sodium, potassium, ammonium, calcium, or
ferric hydroxides, and such organic bases as
isopropylamine, trimethylamine, 2-ethylamino ethanol,
histidine, procaine, and the like.

To treat an animal subject, the inhibitor of
interest is administered parenterally, usually by
intramuscular injection in an appropriate vehicle. Other
modes of administration, however, such as subcutaneous
or intravenous injection and intranasal delivery, are
also acceptable. Injectable formulations will contain an
effective amount of the active ingredient in a vehicle,
the exact amount being readily determined by one skilled
in the art. The active ingredient may typically range
from about 1% to about 95% (w/w) of the composition, or
even higher or lower if appropriate. The quantity to be
administered depends on the animal to be treated and the
particular inhibitor used. Effective dosages can be
readily established by one of ordinary skill in the art
through routine trials establishing dose response curves.
The subject is treated by administration of the
particular inhibitor, in at least one dose. Moreover,
the subject may be administered as many doses as are
required to effectively treat the individual.

Below are examples of specific embodiments for
carrying out the present invention. The examples are
offered for illustrative purposes only, and are not
intended to limit the scope of the present invention in
any way.

EXAMPLES 

Example 1

The efficient expression of active ZYMV 49 kDa protease in
E. coli.

This example describes the construction and
expression in E. coli of a gene which encodes a portion
of the ZYMV polyprotein. The primary translation product
of this gene is a 140 kDa protein which includes the
49 kDa protease and flanking cleavage sites, a portion of
the nuclear inclusion 'b' protein (Nib, also referred to
as the replicase), including the Nib/coat protein
cleavage site, followed by the coat protein (CP).
Evidence is presented showing that the expression of this
gene in E. coli leads to an accumulation of mature CP as
a result of efficient cleavage at the Nib/CP cleavage
site by the 49 kDa protease.

cDNA Cloning and Sequencing of the ZYMV Genome

A California isolate of ZYMV was obtained from
Professor J. A. Dodds of the University of California at
Riverside. The virus was propagated by mechanical
inoculation of the cotyledons of ten-day-old Cucurbita
pepo cv. early straightneck seedlings. Systemically
infected leaves were harvested 3-8 weeks after
inoculation and virus was purified therefrom essentially
as described by Lisa, V., et al., Phytopathol (1981)
71:667-672. The virus was quantified by absorbance at
260 nm using an extinction coefficient of 2.8 A260/mg/ml.

Viral genomic RNA was isolated from purified
virions by digestion with proteinase K in borate buffer
(pH 9) containing 1% SDS and 4 mM EDTA for one hour at
37°C, followed by phenol/chloroform extraction and
ethanol precipitation. The RNA was redispersed in water,
quantified by absorbance at 260 nm, and analyzed by
agarose gel electrophoresis in the presence of
methylmercuric hydroxide.

DNAs complementary to ZYMV RNA were synthesized
essentially according to Gubler, U., et al., Gene (1983)
25:263-269, as described in the technical manual for the
Riboclone cDNA Synthesis Kit (Promega Corp.). Figure 3
shows an outline of this procedure. The first strand was
synthesized using AMV reverse transcriptase and an
oligodeoxythymidylate primer. After second strand
synthesis, EcoRI linkers were added and digested, and the
cDNAs were ligated into the EcoRI site of pBluescript
(Stratagene, Inc.). The ligation product was then used
to transform competent E. coli XL-1 Blue cells
(Stratagene) which were then plated in the presence of
lac inducer (IPTG) and substrate (X-gal) for color
selection of recombinants. Plasmid DNA was isolated from
colorless clones by the alkaline lysis miniprep method
(Molecular Cloning: A Laboratory Manual, 2d Ed., J.
Sambrook, E. Fritsch, and T. Maniatis, eds., Cold Spring
Harbor Press, New York, 1989) and insert sizes were
estimated after digestion with EcoRI by agarose gel
electrophoresis in the presence of ethidium bromide.

pZR1, the largest cDNA clone obtained from the
first experiment, had a 2.3 kb insert, the ends of which
were sequenced using the Sanger dideoxy chain-terminating
method as described in the product literature for the
Sequenase 2 sequencing kit (United States Biologicals)
with the M13 universal and reverse primers encoded on
either end of the multiple cloning site in pBluescript.
The orientation of the insert relative to the viral
genome and the multiple cloning site was indicated by the
appearance of the polyadenylate tract from the 3' end of
the genome in the sequence from the reverse primer. The
remainder of the clone was sequenced stepwise, 200-300
nucleotides at a time, in both directions from synthetic
oligodeoxynucleotide primers complementary to the distal
ends of each of the successive sequencing runs. Sequence
data were processed and analyzed on a DEC VAX 11/750
minicomputer.

The second round of cDNA cloning was
accomplished in the same manner as the first except that
a synthetic oligodeoxynucleotide complementary to the 5'
end of pZR1 was used as primer and the cDNAs were ligated
directly, without linkers, into the EcoRV site of
pBluescript (Figure 3). From this cloning two clones
were obtained, pZB11 and pZB60, which had inserts of 2.3
kb and 3.8 kb, respectively. For the sequencing of
pZB11, nested deletions were prepared from each end of
the insert according to Henikoff, S., Gene (1984)
28:357ff as described in the product literature for the
Erase-a-base System kit (Promega Corp.). For each
direction, approximately twenty-four clones containing
deletions spanning the entire length of the insert were
sequenced simultaneously from the M13 universal or
reverse primers. Any gaps left by failure of the
sequences of adjacent time points to overlap were filled
in using synthetic oligodeoxynucleotide primers made from
the sequence near the 5' end of the gap.

Clone pZB60 was sequenced in both directions
from nested deletions in the same manner as for pZB11
except that a unique NcoI site within pZB11 was used as
the starting point for deletions in the 5' direction and
only those clones with deletions mapping between the 5'
end of pZB11 and the 5' end of pZB60 were sequenced.

A third round of cDNA cloning was conducted as
described above for the preparation of pZB11 and pZB60
except that a synthetic oligomer complementary to the 5'
end of pZB60 was used as a primer. From this round,
pZF18, having an insert of 3.7 kb, was obtained and
sequenced as described above for pZB11 and pZB60.

The 5' end of the viral RNA sequence was
determined by reverse transcription of purified viral RNA
using a synthetic oligonucleotide primer complementary to
nucleotides 76-99 at the 5' end of pZF18. The Sanger
dideoxynucleotide chain-terminating method was used
essentially as described in the Promega Gem Seq manual
(Promega Corp.).

The continuous open reading frame of the viral
genome was identified with the aid of a computer as
described above. The coding sequences of the functional
ZYMV gene products were identified by amino acid sequence
homology to those of other potyviruses (Allison, R., et
al., Virology (1986) 154:9-20; Domier, L.L., et al.,
Nucleic Acids Res (1986) 14:5417-5430; Robaglia, C., et
al., J Gen Virol (1989) 70:935-947; Maiss, E., et al., J
Gen Virol (1989) 70:513-524). The identity of the coat
protein gene was further confirmed by subcloning the
presumptive coding sequence into a modified version of
pBluescript from which the gene could be expressed in
vitro. In vitro translation of the gene produced a
product of the expected size which reacted specifically
with antiserum raised against purified ZYMV coat protein
when analyzed by Western blotting.

Figure 4 shows the nucleotide sequence of the
ZYMV genome as determined above along with the deduced
amino acid sequence. The nucleotide sequence is numbered
from the 5' terminus. The 5' non-coding region extends
from nucleotide 1 to nucleotide 139. Nucleotides 140-142
initiate the polyprotein coding sequence with a
methionine codon in a consensus translation initiation
context (Joshi, C.P., Nucleic Acids Res (1987) 15:6643-
6653). By homology with the potyviral polyprotein
sequences cited above, the cleavage site between the
aphid transmission helper component (HC) and the 46 kDa
protein is believed to occur between the glycine at codon
766 (nucleotides 2435-2437) and the glycine at codon 767
(nucleotides 2438-2440). The cleavage site between the
46 kDa protein and the cytoplasmic inclusion protein (CI)
is believed to occur between the glutamine at codon 1164
(nucleotides 3629-3631) and the glycine at codon 1165
(nucleotides 3632-3634). The cleavage site between CI
and VPg/protease (VPg and protease are probably not
separated in ZYMV) is believed to occur between the
glutamine at codon 1798 (nucleotides 5531-5533) and the
serine at codon 1799 (nucleotides 5534-5536). The
cleavage site between VPg/protease and RNA replicase
(Rep) is believed to occur between the glutamine at codon
2284 (nucleotides 6989-6991) and the serine at codon 2285
(nucleotides 6992-6994). The cleavage site between the
RNA replicase and the coat protein (CP) is believed to
occur between the glutamine at codon 2801 (nucleotides
8540-8542) and the serine at codon 2802 (nucleotides
8543-8545). Termination of the polyprotein coincides
with termination of the coat protein and is believed to
occur at the stop codon (nucleotides 9380-9382) following
the glutamine at codon 3080. The 3' non-coding sequence
then extends from nucleotide 9383 to nucleotide 9593
before terminating in a polyadenylate sequence of
variable length. cDNA clone pZR1 contained approximately
80 adenosines at its 3' terminus.
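
The nucleotide coordinates quoted above follow from simple codon arithmetic: with the initiating methionine at nucleotides 140-142, codon n occupies nucleotides 140 + 3(n - 1) through 142 + 3(n - 1). The following sketch is an illustrative consistency check only; the function and variable names are hypothetical, and the codon and nucleotide values are those stated in the text.

```python
ORF_START = 140  # first nucleotide of the initiating methionine codon


def codon_to_nucleotides(codon, orf_start=ORF_START):
    """Return the (first, last) nucleotide positions of a 1-based codon."""
    if codon < 1:
        raise ValueError("codon numbers are 1-based")
    first = orf_start + 3 * (codon - 1)
    return first, first + 2


# Codon -> nucleotide range, as reported for the cleavage-site residues.
REPORTED = {
    766: (2435, 2437), 767: (2438, 2440),    # HC / 46 kDa protein
    1164: (3629, 3631), 1165: (3632, 3634),  # 46 kDa protein / CI
    1798: (5531, 5533), 1799: (5534, 5536),  # CI / VPg-protease
    2284: (6989, 6991), 2285: (6992, 6994),  # VPg-protease / replicase
    2801: (8540, 8542), 2802: (8543, 8545),  # replicase / CP
}


def verify():
    """Check every reported range against the codon arithmetic."""
    for codon, expected in REPORTED.items():
        assert codon_to_nucleotides(codon) == expected, codon
    # The stop codon immediately follows the last sense codon (3080).
    assert codon_to_nucleotides(3081) == (9380, 9382)


if __name__ == "__main__":
    verify()
    print("all reported coordinates are consistent")
```

All ten cleavage-site codons and the stop codon agree with the stated reading frame, so the listed coordinates are internally consistent.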

pZPro5

A restriction fragment of 1666 base pairs (bp)
extending between the PvuII and SspI sites of ZYMV cDNA
clone pZB11 (described above) was isolated by agarose gel
electrophoresis and ligated into the SmaI site of plasmid
pTZ18U (Sambrook, J., et al., Molecular Cloning, Cold
Spring Harbor Laboratory, 1989; Mead, D.A., et al.,
Protein Engineering (1986) 1:67). This restriction
fragment comprises a portion of the coding sequence of
the ZYMV polyprotein which includes part of the
cytoplasmic inclusion protein (CI), the 6 kDa protein,
the 49 kDa protease, and a portion of the Nib protein.
Insertion of this fragment into the SmaI site of pTZ18U
places the reading frame encoding these proteins in phase
with that of the expressible lacZα gene of pTZ18U such
that expression of this gene from the lac promoter is
expected to produce a fusion protein comprised of a small
portion of the lacZα peptide fused to the amino terminus
of the ZYMV polyprotein fragment. This construct was
denoted pZPro5 and its structure was confirmed by
dideoxynucleotide sequencing (Sanger, F., et al., Proc.
Natl. Acad. Sci. USA (1977) 74:5463-5467).

pZPro6 and pZPro7

The 2280 bp SalI restriction fragment from ZYMV
cDNA clone pZR1 (described above), which comprises a
portion of the ZYMV genome including part of the Nib
protein, CP, the 3' non-coding sequence, and a portion of
the polyadenylate sequence, was inserted into the SalI
site of pZPro5 to create pZPro6. Dideoxynucleotide
sequencing of pZPro6 confirmed that the Nib/CP-encoding
reading frame of the inserted fragment was in phase with
the open reading frame of pZPro5 such that expression of
this construct from the lac promoter is expected to
produce a single polypeptide of approximately 140 kDa.

In a further refinement of pZPro6, the 1244 bp
MluI-NaeI fragment was removed and the 705 bp MluI-EcoRV
fragment from pZR1 was inserted in its place, removing
most of the ZYMV 3' non-coding and polyadenylate
sequences, which include several unwanted restriction
sites. This construct was denoted pZPro7. E. coli
strain DH5α was transformed with pZPro6 and pZPro7, and
transformed clones were identified and isolated by
selection for ampicillin resistance.

Expression of the lacZα-ZYMV polyprotein gene
in pZPro6 and pZPro7 was monitored by immunoblotting of
SDS/PAGE-resolved proteins from these cells using
polyclonal antisera raised in rabbits against denatured
ZYMV coat protein (Burnette, W.N., Anal. Biochem. (1981)
112:195). Results are shown in Figure 5B. Extract from
cells harboring either pZPro6 or pZPro7 contained a
single immunoreactive band which co-migrated with the
major species of mature ZYMV coat protein at
approximately 31 kDa. Since the CP-containing primary
translation product of 140 kDa was not detected in these
extracts, the exclusive appearance of mature CP implies
correct and efficient processing of the polyprotein by
the ZYMV 49 kDa protease.

To rule out the possibility that mature CP
might have been produced either by an endogenous E. coli
protease, or by fortuitous initiation of translation near
the amino terminus of mature CP, a variant of pZPro7 was
also analyzed. This variant, denoted placZα-CP, was made
by deleting the sequence encoding the protease and most
of the Nib protein from pZPro7, leaving part of the Nib
protein and CP in phase with the lacZα peptide in a 46
kDa open reading frame (see Figure 5A). Extracts from
cells harboring this construct contained a single
immunoreactive band which migrated with an apparent MW of
46 kDa (Figure 5B, lane 2). The apparent absence in
these cells of a species co-migrating with mature CP in
the absence of the ZYMV 49 kDa protease indicates that
the activity of the latter is indeed responsible for the
occurrence of mature CP in cells harboring pZPro6 and
pZPro7.
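
As a rough plausibility check on the approximately 31 kDa band reported above, the length of the coat protein implied by the Figure 4 coordinates (codons 2802 through 3080) can be converted to an approximate mass. The sketch below assumes an average residue mass of about 110 Da, a common rule of thumb that is not taken from this disclosure; the names used are hypothetical.

```python
AVG_RESIDUE_MASS_DA = 110.0  # assumed average mass per amino acid residue


def approx_mass_kda(first_codon, last_codon):
    """Approximate polypeptide mass in kDa from inclusive codon numbers."""
    if last_codon < first_codon:
        raise ValueError("last_codon must not precede first_codon")
    residues = last_codon - first_codon + 1
    return residues * AVG_RESIDUE_MASS_DA / 1000.0


if __name__ == "__main__":
    # Coat protein: serine at codon 2802 through glutamine at codon 3080.
    print(round(approx_mass_kda(2802, 3080), 1))
```

The 279 residues of the coat protein give an estimate close to 31 kDa, in line with the observed band. Such an estimate ignores composition and any modification, so it only indicates that the reported size is reasonable.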

Construction and analysis of genes which confer a
negative phenotype on E. coli cells by virtue of the
activity of the ZYMV 49 kDa protease according to the
scheme described above for Protease Inhibitor Selection
System I.

This example describes the construction of
expressible genes encoding polyproteins which contain the
ZYMV 49 kDa protease and the E. coli ribosomal protein
S12. The ability of these gene constructs to confer
sensitivity to the antibiotic streptomycin on several
streptomycin-resistant E. coli strains by virtue of
correct and efficient processing of the polyprotein by
the ZYMV 49 kDa protease is demonstrated.

Streptomycin lethality in E. coli has been
ascribed to its ability to interfere with protein
synthesis by binding to the S12 subunit of the 30S
component of the ribosome (Gorini, L., in Ribosomes, M.
Nomura, A. Tissieres, P. Lengyel, eds., Cold Spring
Harbor Laboratory, 1974, pp. 791-803). Streptomycin-
resistant mutants have been isolated which express
altered forms of S12 which retain the ability to
participate in the assembly of ribosomes which can
function in the presence of streptomycin. Wildtype S12
has been shown to confer a dominant streptomycin-
sensitive phenotype on merodiploids which express both
wildtype and streptomycin-resistant forms of S12. The
highly sequestered position of S12 in the ribosome
suggests that S12-containing polyproteins should be too
encumbered to participate in the assembly of functional
ribosomes. Thus, the expression of such polyproteins in
streptomycin-resistant hosts should be unable to confer
streptomycin sensitivity unless mature S12 can be
proteolytically freed from the polyprotein.

The CP-encoding sequences in pZPro7 were
precisely replaced with the coding sequence for S12 to
create pZPro9. This was accomplished as follows. The
sequence bounded by the BglII site in Nib and the P1'
position of the Nib/CP cleavage site in pZPro7 (Schechter,
I., and A. Berger, Biochem. Biophys. Res. Commun. (1967)
27:157) was amplified by polymerase chain reaction (PCR,
Saiki, R.K., et al., Science (1988) 239:487). The S12
coding sequence from the second amino acid to the end was
amplified by PCR from plasmid pNO1523 (Dean, D., Gene
(1981) 15:99-102). The 3' primer contained an MluI site
following the stop codon. Following cleavage of the
first PCR product by BglII and the second by MluI, both
were simultaneously ligated into pZPro7 from which the
BglII-MluI fragment had been removed. Following
transformation and plasmid DNA purification, the
structure of pZPro9 (see Figure 6) was confirmed by
dideoxynucleotide sequencing.

pZPro9 was then transformed into streptomycin-
resistant E. coli strains MC1009, HB101, and N100
(American Type Culture Collection Catalogue of Bacteria
and Phages, 1989). After two rounds of single colony
isolation in the presence of ampicillin (amp), single
colonies of each transformant were grown in Luria-Bertani
medium (LB) containing 50 µg/ml amp to mid-log phase and
plated on solid LB containing 100 µg/ml amp, 100 µg/ml
streptomycin (strep), or both.

Consistently, fewer than one in 10* amp-resistant
colony-forming units (cfu) of each transformant
was observed to grow in the presence of both amp and
strep, while the same hosts harboring pZPro7 plated with
similar efficiencies on amp alone or amp and strep (see
Figure 6). Thus, by virtue of having S12 in place of CP,
pZPro9 is able to confer strep sensitivity on
strep-resistant hosts, while its CP-containing parent, pZPro7,
is not. The pZPro9 transformants were fully sensitive to
as little as 3 µg strep/ml while the parent strains were
fully resistant to up to 350 µg/ml. Also, the pZPro9
transformants plated equally well on strep alone or amp
alone, indicating that pZPro9 is quickly lost in the
absence of amp selection and that there is no discernible
tendency to replace the strep-resistant gene in the host
chromosome with the S12 gene by homologous recombination.
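
The plating comparison above amounts to a ratio of colony counts on selective versus reference plates. A minimal sketch of that calculation follows; the function name plating_efficiency and all colony counts are hypothetical values chosen only to illustrate the arithmetic, not data from the experiments described.

```python
def plating_efficiency(cfu_selective, cfu_reference):
    """Fraction of reference cfu that also form colonies on the selective plate."""
    if cfu_reference <= 0:
        raise ValueError("reference count must be positive")
    return cfu_selective / cfu_reference


# Hypothetical colony counts per ml of culture (already dilution-corrected).
counts = {
    "pZPro7": {"amp": 2.0e8, "amp+strep": 1.9e8},
    "pZPro9": {"amp": 2.0e8, "amp+strep": 5.0e1},
}

for plasmid, c in counts.items():
    eop = plating_efficiency(c["amp+strep"], c["amp"])
    print(f"{plasmid}: efficiency on amp+strep relative to amp = {eop:.2e}")
```

With numbers of this kind, a parent plasmid would give a ratio near one, while a plasmid conferring strep sensitivity would give a ratio many orders of magnitude lower.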

To confirm that cleavage of the polyprotein by
the ZYMV 49 kDa protease to liberate mature S12 is
required for strep sensitivity, the NIb/CP cleavage site
was removed from pZPro9 to create pZPro12, which should
produce a polyprotein from which S12 cannot be freed by
the protease. This was accomplished by cleaving pZPro9
with EcoRV and HpaI, which removed most of the NIb
protein including the NIb/CP cleavage site, and replacing
it with the fragment produced by EcoRV alone, which
restored most of the NIb protein down to within 12 amino
acids of the NIb/CP cleavage site (see Figure 6). The
structure of pZPro12 was confirmed by dideoxynucleotide
sequencing. Transformation with pZPro12 has no
discernible effect on the ability of strep-resistant
E. coli strains to grow vigorously in the presence of up to
350 µg/ml streptomycin. Thus, the strep-sensitive
phenotype produced by pZPro9 is completely dependent on
the presence of a substrate cleavage site at which the
protease can cleave functional S12 from the polyprotein.

The expression of the pZPro9 polyprotein, like
that of most large eucaryotic proteins, places a considerable
burden on growing E. coli cells. In an attempt to reduce
this burden, pZPro9 was streamlined by removing the
EcoRV fragment described above, which contains most of
the NIb protein exclusive of the cleavage sites at either
end. This construct, denoted pZPro10, encodes a
polyprotein of about 83 kDa, of which 49 kDa is the
protease and about 14 kDa is S12 (see Figure 6). Upon
transformation with pZPro10, strep-resistant E. coli
strains displayed a strep-sensitive phenotype identical
to that of the pZPro9 transformants. In addition,
pZPro10 transformants grew considerably more vigorously
than pZPro9 transformants, indicating a significant
reduction in the metabolic burden on the host cells.
Thus, removal of most of the NIb protein had no
discernible effect on the efficiency of removal of
functional S12 from the polyprotein by the protease.

However, pZPro10-expressing cells still grew
poorly compared to the untransformed host. Recent work
with another viral protease suggests that this is
probably due, at least in part, to fortuitous activity of
the protease on host proteins (Baum, E.Z., et al., Proc.
Natl. Acad. Sci. USA (1990) 87:5573-5577). Inhibitors of
the protease should at least partly restore normal
growth. As this growth differential is relatively easy
to score, it is possible to use the toxicity of the
protease as the negative phenotype, and to select
inhibitors from peptide libraries, or to confirm selected
inhibitors on the basis of their ability to restore rapid
growth.

Once the protease removes itself from the
polyprotein of pZPro10, the 14 kDa S12 is left in a 21.5
kDa precursor until freed by the protease. To confirm
that neither this precursor nor the polyprotein itself is
able to confer strep sensitivity, the NIb/CP cleavage
site was removed from pZPro10 in the same manner that the
EcoRV-HpaI restriction fragment was removed from pZPro9
to create pZPro12. This new construct, pZPro11, shown in
Figure 6, had no discernible effect on the level of strep
resistance shown by strep-resistant E. coli strains.
Thus, again, the presence of a substrate cleavage site
adjacent to S12 is required for strep sensitivity,
implying that the ZYMV 49 kDa protease is specifically
responsible for generating functional S12.
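
The approximate masses quoted for the pZPro10 polyprotein can be tallied with simple subtraction. The sketch below uses only the figures stated in the text (about 83, 49, 21.5, and 14 kDa); the variable names and the helper remainder are illustrative assumptions, and the derived values are only as precise as the rounded inputs.

```python
# Approximate masses stated in the text (kDa).
POLYPROTEIN = 83.0    # whole pZPro10 polyprotein ("about 83 kDa")
PROTEASE = 49.0       # ZYMV protease
S12_PRECURSOR = 21.5  # S12-containing precursor left after protease self-excision
S12 = 14.0            # mature S12 ("about 14 kDa")


def remainder(total, *parts):
    """Mass left after subtracting the named parts from the total."""
    return total - sum(parts)


# Sequence carried by the precursor in addition to S12 itself.
precursor_extension = remainder(S12_PRECURSOR, S12)

# Polyprotein mass lying outside both the protease and the S12 precursor.
other_sequence = remainder(POLYPROTEIN, PROTEASE, S12_PRECURSOR)

print(precursor_extension, other_sequence)
```

On these figures the precursor carries roughly 7.5 kDa beyond S12, and roughly 12.5 kDa of the polyprotein falls outside both the protease and the precursor.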

Thus, systems for identifying and selecting
protease inhibitors from peptide libraries have been
disclosed. Although preferred embodiments of the subject
invention have been described in some detail, it is
understood that obvious variations can be made without
departing from the spirit and the scope of the invention
as defined by the appended claims.

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

AAAATTGAAA CAAATCACAA AGACTACAAG AATCAACGAT CAAGCAAACC AATTTTTGAA 60 

CGTATTTACA AACAAGCAAT CTAAAACTCT TACAGTATTA AGAAATTCTC CAATCACTTC 120 

GTTTACTTCA GACATAACAA TGCCCTCCAT CATGATTGCT TCAATCTCTG TACCCATTGC 180 

AAAGACTGAG CAGTCTGCAA ACACTCAAGT AACTAATCGG CCTAATATAG TCGCACCTGG 240 

CCACATGGCA ACATGCCCAT TGCCACTGAA AACGCACATG TATTACAGGC ATGAGTCCAA 300 

GAAGTTGATG CAATCAAACA AGAGCATTGA CATTCTGAAC AACTTCTTCA GCACTGAOGA 360 

GATGAAGTTT AGGCTCACTC GAAACGAGAT GAGCAAGCTG AAAAAGCGTC CGAGCGGGAG 420 

GATAGTCCTC CGCAAGCCCA GTAAGCAGCG GGTTTTCGCT CGTATCGAGC AGGATGAGGC 480 

AGCACGCAAG GAAGAGGCTG TTTTCCTCGA AGGAAATTAT GACGATTCCA TCACAAATCT 540 

AGCACGTGTT CTTCCACCTG AAGTGACTCA CAACGTTGAT GTGAGCTTGC GATCACCGTT 600 

TTACAAGCGC ACATACAAGA AGGAAAGGAA GAAAGTGGCG CAAAAGCAAA TTGTGCAAGC 660 

ACCACTTAAT AGCTTGTCCA CACCTGTTCT TAAAATTGCA CGCAATAAAA ATATCCCTGT 720 

TGAGATGATT GGCAACAAGA AGGCGAGACA TACACTCACC TTCAAGAGGT TTAGGGOATG 780 

TTTTGTTGGA AAGGTCTCAG TTGCGCATGA AGAAGGACGA ATGCGGCACA CTGAGATGTC 840 

GTATGAGCAG TTTAAATGGC TTCTTAAAGC CATTTCTCAG GTCACCCATA CAGAGCGAAT 900 

TCGTGAGGAA GATATTAAAC CAGCTTGTAG TGGGTCGGTG TTGGGCACTA ATCATACATT 960 

GACTAAAAGA TATTCAACAT TGCCACATTT GGTGATTCGA CGTAGAGACG ACGATGGGAT 1020 

TGTGAACGCC CTGGAACAGC TCTTATTTTA TAGCGAAGTT GACCACTATT CGTCGCAACC 1080 

GCAAGTTCAG TTCTTCCAAG GATGGCGACG AATGTTTGAT AAGTTTAGGC CTAGCCCAGA 1140

TCATCTGTGC AAAGTTCACC ACAACAACGA GGAATGTCGT GAGTTAGGAG CAATCTTTTG 1200

TCAGGCTCTA TTCCCACTAG TGAAACTATC GTGCCAAACA TGCAGAGAAA AGCTTAGTAG 1260 

AGTTAGCTTT GAGGAATTCA AAGATTCTTT GAACGCAAAC TTTATTATCC ACAAGGATGA 1320 

ATGGGGTAGT TTCAAGGAAG GCTCTCAATA CCATAATATT TTCAAATTAA TCAAAGTGGC 1380 

AACACAGGCA ACTCAGAATC TCAAGCTCTC ATCTGAAGTT ATGAAATTAG TTCAGAACCA 1440 

CACAAGCACT CACATCAACC AAATACAAGA CATCAATAAG GCGCTCATGA AAGGTTCATT 1500 

GGTTGCGCAA GACGAATTGG ACTTAGCTTT GAAACAGCTT CTTGAAATGA CTCAGTGGTT 1560 



TAAGAACCAC ATGCACCTGA CTGGTGAGGA GGCATTGAAO ATGTTCAGAA ATAAGCGTTC 1620 



TAGCAAGGCC ATGATAAATC CTACCCTTCT ATOTGGCAAC CAATTGGACA AAAATGGAAA 1680 

TTTT G ITT G G GGAGAAAGAG GATACCATTC CAAOCOATTA TTCAAGAACT TCTTCGAAGA 1740 

AGTAATACCA AGOGAAGGAT AT ACQ AAGT A CGTAGTGCGA AACTTTCCAA ATGGTACTCG 1800 

TAAGTTGGCC ATAGGCTCAT TGATTGTACC ACTTAATTTG GATAGGGCAC GCACTGCACT 1860 

ACTTGGAGAG AGTATTGAGA AGAAGCCACT CACATCAGCG TCTCTCT CC C AACAGAATGG 1920 

AAATTATATA CACTCATOCT GCTOTCTAAC GATGGATGAT CG AACCCCG A TGTACTCCG A 1980 

GCTXAAGAGC COGAOGAAGA GGCATCTAGT TATAGGAGCT TCTAGTGATC CAAAOTACAT 2040 

TGATCTGCCA GCATCTGAGG CAGAACGCAT GTATATAGCA AAGGAAGGTT ATTCCTATCT 2100 

CAGTATTTTC CTCGCAATGC TTCTAAATCT TAATGAGAAC GAAGCAAAGG ATTTCACCAA 2160 

AATGATTCGT GATGTTTTGA TCCCCATGCT TGGGCAGTGG CCTTCATTGA TGGATGTTGC 2220 

AACTGCAGCA TATATTCTAO GTGTATTCCA TCCTGAAAOG CGATGCGCTG AATTACCCAG 2280 

GATCCTTGTT GACCACGCTA CACAAACCAT GCATGTCATT GATTCTTATG GATCACTAAC 2340 

TGTTGGGTAT CACGTGCTCA AGGCTGGAAC TGTCAATCAT TTAATTCAAT TTGCCTCAAA 2400

TGATCTGCAA AGCGAGATGA AACATTACAG AGTTGGTGGG ACACCAACAC ACCCCATTAA 2460

ACTCOAOCAO CAOCTCATTA AAGCAATTTT CAAACCAAAA CTTATGATGC ACCTCCTCCA 2520 

TGATGACCCA TACATATTAT TACTTGGCAT GATTTCACCC ACCATTCTTG TACATATGTA 2580 

TAOCATCCCT CATTTTCACC GGGGTATTGA GATATGGATT AAGAGGGATC AT6AAATCGG 2640 

AAAGATTTTC GTCATATTAG AOCACCTCAC ACGCAACCTT GCTCTCCCAC AACTTCTTCT 2700 

GGATCAACTT AACTTCATAA CTCAAGCTTC ACCACATTTA CTTOAAATTA TGAACCGTTC 2760 

TCAAGATAAT CAGAGGGCAT ACGTACCTGC GCTGGATTTG CTAACGATAC AAGTGGAGCG 2820 

TGAGTTTTCA AATAAA6AAC TCAAAACCAA TGGCTATCCA GATTTGCAGC AAACGCTCTT 2880 

CGATATGAGG GAAAAAATGT ATGCAAAGCA GCTGCACAAT TCATGGCAAG ACCTAAGCTT 2940 

GCTGGAAAAA TCCTGTGTAA CCGTGCGATT CAAGCAATTC TCGATTTTTA CGGAAAGAAA 3000 

TTTAATCCAG CGAGCAAAAG AAOCAAAOCG CGCATCTTCG CTACAATTTG TTCAOOAGTO 3060 

TTTTATCACG ACCCGAGTAC ATGCGAAGAG CATTCGCGAT GCAGGCGTGC GTAAACTAAA 3120 

TGAGGCTCTC GTCGGAACTT GTAAATTCTT TTTCTCTTGT GGTTTCAAAA TTTTTGCGCG 3180 

ATGCTATAGC GACATAATAT ACCTTGTGAA CGTGTOTTTO GTTTTCTCCT TGGTGCTACA 3240 

AATGTCCAAT ACTOTGOGCA GTATGATAGC AGCGACAAGG GAAGAAAAAG AGAGAGOGAT 3300 

GGCAAATAAA GCTGATGAAA ATGAAAGGAC GTTAATGCAT ATCTACCACA TTTTCAGCAA 3360 

GAAACAGGAT GATGCGCCCA TATACAATGA CTTTCTTCAA CATCTOCGTA ATGTGAGACC 3420 

AGATCTTGAG GAAACTCTCT TCTACATGGC TGGCGTAGAA GTTGTTTCAA CACAGCCTAA 3480 

GTCAGCGGTT CAGATTCAAT TCGAGAAAAT TATAGCTGTG TTGCCGCTCC TTACCATGTG 3540

CTTTGACGCC GAAAGAACCG ATCCCATTTT CAAGATTTTG ACAAAACTCA AAACAGTTTT 3600 

TGGTACGGTT GGAGAAACGC TCCGACTTCA AGGGCTTGAA GACATTCAAA GCTTGGAGGA 3660

CGATAAAAGA CTCACAATTG ATTTTGATAT TAACACGAAC GAGGCTCAAT CGTCAACAAC 3720

ATTTCATCTC CATTTT O ATG ACTG6TCGAA TCGGCAACTA CAGCAAAATC CCACACTTCC 3780



CAATOAAATA CCATCATCAA GTGAGGGAGA GTTCTTAGTT AGAGOAGCAC TAGGTTCTGG 3900 



AAAATCAACG AGCTTACCTG CACATCTTGC CAAGAAGGGT AAGGTGTTAC TACTCGAACC 3960 



TACACGCCCT TTGGCGGAGA ATGTTAGTAG ACAGTTAGCA GGTOATCCTT TCTTTCAAAA 4020 

CGTTACACTC AGAATGAGAG GGTTAAOTTG TTTTGGTTCA AOCAATATTA CAGTGATGAC 4080 



GAGTGGATTT GCTTTTCACT ACTATGTTAA CAATCCACAT CAATTGATGG AATTTGACTT 4140 



TGTCATCATA GACGAGTCCC ATGTCACAGA CAGTGCGACC ATAGCTTTCA ATTGTGCACT 4200 

TAAAGAGTAC AACTTTGCTG GGAAATTGAT TAAAGTGTCT GCAACGCOGC CAGGGAGAGA 4260 



GTGCGATTTC GATACGCAAT TCGCGGTGAA AGTCAAAACA GAGGACCATC TTTCATTCCA 4320 



TOCATTCGTT GGCGCACAGA AGACTGGTTC AAATGCTGAC ATGGTTCAGC ATGGTAATAA 4380 

CATACTTGTG TATGTTCCAA GTTACAACGA AGTGGACATG CTCTCTAAGT TACTCACTGA 4440 



GCCCCAATTT TCAGTTACAA AGGTAGATGG GCGAACAATG CAGCTTGGAA AAACTACCAT 4500 



TGAAAOGCAT GGAACTAGCC AAAAGCCCCA TTTCATACTA GCTACAAACA TCATCGAGAA 4560 

TGGAGTGACG TTGGATGT7G AGTGTGTTGT TGATTTTOGA CTAAAAGTGG TCGCAGAACT 4620 



GGACAGCGAA AATCGCTGTG TGCGCTACAA TAAGAAATCA GTTAGTTATG GAGAGAGGAT 4680 



TCAGCCACTA GGAAGAGTGG GGAGATCTAA GCCTCGAACT GCATTGCGTA TAGGGCACAC 4740 

AGAAAAAGGC ATCGAAACGA TTCCTGAATT CATTGCCACA GAAGCAGCAG CCTTATCATT 4800 

TGCATATGGG CTTCCAGTCA CCACACATGG ACTTTCCACA AATATACTTG GAAAGTGGAC 4860 

AGTTAAACAG ATGAAATCTG CTTTGAACTT TGAGCTAACT CCTTTCTTCA CCACTCATTT 4920 



ACATTACAGG ACCACAGGCA AATTCCTTGA ATTTACCAGA AATACTGGAG CTTTTGTGGC

AATCCQTCAT OATOGTAOTA TCCATCCACT AAIACACOAA GAATTOAAGC AOTTCAAACT 4980

CAGOGATTCA OAAATCGTGC TCAACAACGT TGCATTACCT CATCAATTTC TGAGCCAATG 5040 

CATOCATCAA A6T0A6TAT0 AACCCATTCG AOTOCACOTT CAATGCCATO AOAGCACACC 5100 

CATACcrrrr tacacaaatg gaaiacctga taaagtctat gagagaatet ggaagigcat 5160

ACAAGAAAAC AAGAACGATG COGTTTTTGO IAAGCTTTCA ACTGCTTGTT CAACTAAOGT 5220

TAGTTATACA CTTAGCACTG ATCCAGCAGC ATTACCCAOA ACtAHOCAA TCATCGATCA 5280 

CCTOCTTGCC GAGGAAATGA TGAAGCGGAA TCACTTCGAC ACXATCAGCT CAGCTGTAAC 5340 

GGGCTATTCA TTTTCCCTTG CTGGAATZGC TGATTCTTTC AGGAAGAGAT ACATGOGCGA 5400 

TTACACAGCG CACAACATTG CAATTCTGCA ACAAOCACGT GCCCAGCTGC TTOAATTTAA 5460 

IAGTAAGAAT GTGAACATTA ACAATCTGTC OGATTTAGAA GGAATTGGAG TCATTAAGTC 5520

GGTGGTGTTG CAAAOTAAOC AAGAGGTCAG CAOTTTCCTC OGACTTCOCO GTAftATGGGA 5580 

TGGAAAGAAA TTTGCGAATO ATGTGATATT GGCGATTATG ACACTCTTAG GAGGTGGGTG 5640 

GTTCATGTGG GAATACITCA OGAAAAAGAT CAATGAACCC GTGCGCGTTG AAAGCAAGAA 5700

AOGTCGAICT CAAAAAXTGA AAXTCAGGGA TCCGTACGAT AGAAAAGTTG GACOTOAOAI 5760 

TTTTGGTGAT GATCATACAA ITGGGCGCAC TTtCGGCGAA GCTTACAOOA AGAGAGGAAA 5820 
CGTCAAAGGA AACAACAACA CAAAAOGAAT GGGAOGGAAA ACTOGCAATT TTOTGCATTt 
ATATGGTGTG GAGCCTGAGA ATTACAGTTT TAICAGATTT GTGGACCCTC TCACTGGCCA 
TACATTGGAC GAAAGCACCC ATACAGACAT ATCGITAGTG CAGOAGGAGT TTGGAAGTAI 
TAGAGAGAAA TTTCTGGAGA ATGATTTGAT CTCGAGGCAG TCTATTATCA ACAAACCCGG 

CATTCAGGCA TATTTTATGG GCAAGGGCAC TGAAGAAGCA CTCAAA6ITG ACTTGACTCC 6120 

TCATGTACCA TTGCTTCTGT CCAGAAACAC CAATGCTATT GCGGGATACC CAGAGAGAGA 6180

XAATGACTTC AGACAAACTG GCACACCAGT CAAOGTTT C T TTTAAAGACG TGCCACAGAA 6240

AAACGAACAT GTCGACTTGG AGAGCAAATC TATCTACAAA GGAGTGCGCG ATTACAATGG 6300 

CATCTCAACA ATCGTTTGTC AATTAACGAA CGATTCTGAT GGCCTCAAGG AGACCATGTA 6360 

TGGTATTGGC TATGGGCCAA TAATCATCAC TAATGGACAC CTCTTCAGGA AAAACAATGG 6420 

CACA C TT C T A GTCAOGTCTT GGCATGGTGA ATTCATTGTT AAAAATACCA CAACGCTCAA 6480 

AGTGCATTTC ATAGAAGGOA AGGATGTCGT GTTAGTGCGC ATGCCAAAGG ACTTTCCGCC 6540 

GTTTAAAAGC AACGCTTCTT TTAGGGCACC AAAACGCGAG GAACGACGAT CCTTGGTTGC 6600 



GACAAACTTT CAAGAAAAGA GTCTTCGCTC CACTGTTTCG GAATCTTCCA TGACAATACC 6660 

TGAAGGAACT GGCTCATATT GGATACATTG GATTTCGACC AACGAAGGGG ATTGCGGATT 6720 

GCCCATGGTT TCAACAACGG ATOGCAAGAT AATTGGAGTT CATGGTTTGC CTTCCACAGT 6780 



CTCATCTAAG AATTATTTTC TCCCATTCAC TGATGATTTT ATAOCCACOC ATTTOAGCAA 6840 

ACTTGATGAC CTCACATOGA CTCAGCATTG GCTATGGGAA CCTAGCAAAA TTGCGTGGGG 6900 

AACGCTCAAC TTAGTTGATO AACAACCAGG GCCCGAATTT CGTATCTCAA ATCTAGTCAA 6960 

GGATTTATTC ACTTCTGGTG TTGAAACACA GAGGAAGCGA GAAAGATGGG TCTAOGAAAG 7020 

CTGTGAAGGG AACCTTCGGG CT G TT G GAAC TGCACAATCA CCOTTAGTCA CCAAACATGT 7080 

TGTGAAAGGC AAOTGTCCTT TCTTCGAAGA ATATTXACAA ACACACGCAG A A QCG A GCGC 7140 

CTATTTCAGA CCCCTAATGC GAGAGTACCA GCCGAGGAAG TTGAACAAAG AAGCCTTTAA 7200 

AAAGGATTTC TTTAAATACA ATAAACCCGT CACTGTTAAC CAACTGGATC ATOATAAATT 7260 

TTTGGGAGCA GTGGATGGGG TTATACGTAT GATGTGTOAT TTTGAGTTCA ACGAATGTCG 7320 

ATTCATTACA GATCCCGAGG AAATTTACAA CTCTTT G AAC ATGAAAGCAG CAATTGGAGC 7380 

CCAGTATAGA GGAAAGAAGA AAGAGTATTT TGAGGGGCTA GATGATTTTG ATCCAGAGCG 7440

ACTTTTATTC CAAAGTTGTG AAACCTTGTT CAATGGCTAC AAAGGTCTGT GGAATGGATC
TTTAAAGGCC GAGCTCAGGC CGCTTCAGAA AGTCAGGGCT AACAAAACAC CAACCTTTAC 
ACCACCCCCA ATTCATACAT TCCTTCGACC TAAAOTTTCT GTGGATGATT TCAACAATCA 

CTTCTACAGC AAAAACCTCA AGTGTCCATC GACGGTCGGC ATGACAAAAT TTTATCCTCC 
TTCCCATAAA TTGATGACAT CATTACCTCA TGCTTCCTTC TATTCTCATG CTGAT^SATC 
ACACTTCCAT ACTTCCTTAA CCCCACCCTT ACTGAACGCA CTGCTCATAA TCACOTCATT 

TTATATGGAC GATTCGTGCC TCGGCCAAGA GATGCTTGAA AATCTTTATG CCGAGATTGT 
GTACACTCCA ATTCTTGCTC CTGATGCAAC AATTTTCAAG AAATTTAGAG GTAACAACAG 
TGGGCAACCC TCAACAGTGG TGGATAACAC ACTAATGGTT GTCATCTCTA TTTACTATCC 

GTGCATGAAA IIT GG TTCGA ACTGCGAGGA GATTCAGAAT AAACTTGTC? TCTTTGCAAA 
TGGAGATGAT CTGATACTTG CAGTCAAAGA TGAGGATAGC GGCTTACTTG ATAACATGTC 
ATCCTCTTTT TGCGAACTTG GACTGAATTA TGATTTTTCA GAACGTACGC ATAAAAGAGA 



AGATCTTTGG TTCATGTCCC ACCAAGCAAT CCTAGTTCAT GGAATGTACA CTCCAAAACT 
CGAGAAAGAG AGAATTCTTT CAATTCTAGA GTCGGATAGA AGCAAAGAAA TTATGCACOG 
AACAGAGGCT ATTTGCGCTG CGATGATTGA GGCATGGGGG CACACCGAGC TCTTGCAAGA 

AATCAGAAAG TTTTACCTAT GGTTCGTTGA AAAAGAAGAG GTCCGAGAAT TGGCAGCCCT 
CGGAAAAGCT CCAXACATAG CTGAGACAGC ACTTCGTAAG TTATACACTG ACAAGGCAGC 
AGATACAAGT GAACTGGCAC GCTACCTACA ACCCCTCCAT CAAGATATCT TCTTTCAGCA 

AGGAGACACT GTGATGCTCC AAXCAGGCAC TCAGCCAACT GTGGCAGATG CTGGAGCTAC 
AAAGAAAGAT AAAGAAGATG ACAAAGGGAA AAACAAGGAC GTTACAGGCT CCGCCTCAGG 
TGAGAAAACA GTAGCAGCTG TCACGAAGGA CAAGGATGTG AATGCTGGTT CTCATGGGAA
AATTGTGCCG CGTCTTTCGA AGATCACAAA GAAAATCTCA TTGCCACGCG TGAAAGGAAA 8760



TCTGATACTC GATATTGATC ATTTGCTGGA ATATAAACCG GATCAAATTG AGTTATATAA 8820 

CACACGAGCG TCTCATCAGC AGTTCGCCTC TT GG TT CA AC CAGGTTAAGA OGGAATATGA 8880 

TTTGAACGAG CAACAGATGG GAGTTGTAAT GAATGGTTTC ATGGTTTGGT GCATTGAGAA 8940 



TGOCACTTCA CCCGACATTA ATGGAGTGTG GGTTATGATG GACGGAAATG AGCAAGTTGA 9000 

CTATCCCTTG AAACCAATAG TTGAAAATGC AAAGCCAACG CTGCGGCAAA TAATGCATCA 9060 

TTTTTGAOAT GCAGCGGAGG CATATATAGA GATGAGAAAT GCAGAGGCAC CATACAT6CC 9120 



GAGGTATGGT TTGCTTCGAA ACCTAOGGGA TAGGAGTTTA GCACGATATG CTTTTGATTT 9180 



CTATGAAGTC AATTCTAAAA CTCCTGAAAG AGCCCOCGAA O C T O TT G COC AGATGAAACC 9240 

AGCAGCTCTT AGCAATGTTT CTTCAAGGTT CTTTGGCCTT GATGGAAATG TTGCCACCAC 9300 



TAGCGAAGAC ACTGAACGGC ACACTGCACG TGATGTTAAT AGAAACATGC ACACCTTACT 9360 



AGCTGTGAAT ACAATOCAGT AAAGGCTAGG CCGCCTACCT AGGTTATTCT TTCGCTGCCG 9420 

ACGTAATTCT AATATTTACC GCTTTATTTG ATATCTTTAG ATTTCCAGAG TGGGCCTCCC 9480 



ACCTTTAAAG CGTAAAGTTT ATGTTAGTTG TCCAGGAGTG CCGTAGTCCT TTOGGAAGCX 9540 



TTAGTGTGAG CCTCTCACCA ATAAGCTCGA GATTAGACTC OGTTTGCAAG CCT 9593 
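
The listing above is laid out as blocks of up to ten bases followed by a running base count. The following sketch is one way such lines could be screened for transcription damage; the function name check_listing_line and the damaged example line are hypothetical, and the check covers only character set, block length, and the presence of a numeric count.

```python
import re

# A well-formed block is 1 to 10 unambiguous bases.
BLOCK = re.compile(r"[ACGT]{1,10}")


def check_listing_line(line):
    """Return a list of problems found in one sequence-listing line."""
    problems = []
    tokens = line.split()
    if tokens and tokens[-1].isdigit():
        tokens = tokens[:-1]  # drop the running count before checking blocks
    else:
        problems.append("running count missing or unreadable")
    for index, block in enumerate(tokens, start=1):
        if not BLOCK.fullmatch(block):
            problems.append(
                f"block {index} is not 1-10 bases of A/C/G/T: {block!r}"
            )
    return problems


# First line of the listing, and a hypothetical damaged line for contrast.
clean = "AAAATTGAAA CAAATCACAA AGACTACAAG AATCAACGAT CAAGCAAACC AATTTTTGAA 60"
damaged = "AAAATTGAAO CAAATCACAA 60"
print(check_listing_line(clean))
print(check_listing_line(damaged))
```

Lines that return a non-empty list would need to be compared against the original filing before the sequence is used.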

Claims

1. A method for detecting a protease
inhibitor, said method comprising:

(a) providing a population of host cells
expressing a first nucleic acid sequence encoding a
protease and a second nucleic acid sequence encoding a
protein capable of conferring a selectable phenotype on
said host cells dependent on the activity of said
protease;

(b) providing a pool of nucleic acid
constructs wherein at least one of said constructs in
said pool comprises a nucleic acid sequence encoding an
inhibitor of said protease;

(c) transforming said host cells of (a) with
said nucleic acid constructs of (b); and

(d) growing said transformed host cells of (c)
under conditions that distinguish cells with said
selectable phenotype, thereby detecting the presence of
said protease inhibitor.

2. The method of claim 1 wherein said host
cells are bacterial cells.

3. The method of claim 2 wherein said
selectable phenotype is the ability of said bacterial
cells to grow in the presence of a given antibiotic.

4. The method of claim 3 wherein said second
nucleic acid sequence comprises a nucleic acid sequence
encoding E. coli ribosomal protein S12 and said
antibiotic is streptomycin.

5. The method of claim 2 wherein said first
nucleic acid sequence comprises a nucleic acid sequence
encoding ZYMV 49 kDa protease.

6. The method of claim 2 wherein said
selectable phenotype is the ability of said transformed
host cells to grow in the presence of a given carbon
source.

7. The method of claim 1 wherein said second
nucleic acid sequence comprises a nucleic acid sequence
encoding a protein which is inactivated by said protease.

8. The method of claim 1 wherein said second
nucleic acid sequence comprises a nucleic acid sequence
encoding a protein which is activated by said protease.

9. A DNA construct comprising:

(a) a first DNA coding sequence for a protein
capable of conferring a selectable phenotype on a host
cell transformed therewith, said selectable phenotype
dependent on the activity of a protease; and

(b) control sequences that are operably linked
to said first and second coding sequences whereby said
coding sequences can be transcribed and translated in a
host cell, and at least one of said control sequences is
heterologous to at least one of said coding sequences.

10. The DNA construct of claim 9 further
comprising a second DNA coding sequence for said
protease.

11. The DNA construct of claim 10 wherein said
protease is ZYMV 49 kDa protease.

12. The DNA construct of claim 9 wherein said
first DNA coding sequence codes for E. coli ribosomal
protein S12 and said selectable phenotype is streptomycin
resistance.

13. The DNA construct of claim 10 wherein said
first DNA coding sequence codes for E. coli ribosomal
protein S12 and said selectable phenotype is streptomycin
resistance.

14. The DNA construct of claim 11 wherein said
first DNA coding sequence codes for E. coli ribosomal
protein S12 and said selectable phenotype is streptomycin
resistance.

15. A DNA construct comprising: 

(a) a DNA coding sequence for a protein 
capable of inhibiting the action of a given protease, 
said protein identified by the method of claim 1; and 

(b) control sequences that are operably linked 
to said coding sequence whereby said coding sequence can 
be transcribed and translated in a host cell, and at 
least one of said control sequences is heterologous to at 
least said coding sequence. 

16. The DNA construct of claim 15 wherein said
protease is ZYMV 49 kDa protease.

17. A host cell stably transformed with a DNA 
construct according to claim 9. 

18. A host cell stably transformed with a DNA 
construct according to claim 10. 



19. The host cell of claim 18 further
transformed with a DNA construct comprising:

(a) a DNA coding sequence for a protein
capable of inhibiting the action of a given protease,
said protein identified by a method comprising

(i) providing a population of host cells
expressing a first nucleic acid sequence encoding a
protease and a second nucleic acid sequence encoding a
protein capable of conferring a selectable phenotype on
said host cells dependent on the activity of said
protease;

(ii) providing a pool of nucleic acid
constructs wherein at least one of said constructs in
said pool comprises a nucleic acid sequence encoding an
inhibitor of said protease;

(iii) transforming said host cells of (i)
with said nucleic acid constructs of (ii); and

(iv) growing said transformed host cells
of (iii) under conditions that distinguish cells with
said selectable phenotype, thereby detecting the presence
of said protease inhibitor; and

(b) control sequences that are operably linked
to said coding sequence whereby said coding sequence can
be transcribed and translated in a host cell, and at
least one of said control sequences is heterologous to at
least said coding sequence.

20. A host cell stably transformed with a DNA 
construct according to claim 11. 

21. The host cell of claim 20 further
transformed with a DNA construct comprising:

(a) a DNA coding sequence for a protein
capable of inhibiting the action of a given protease,
said protein identified by a method comprising

(i) providing a population of host cells
expressing a first nucleic acid sequence encoding a
protease and a second nucleic acid sequence encoding a
protein capable of conferring a selectable phenotype on
said host cells dependent on the activity of said
protease;

(ii) providing a pool of nucleic acid
constructs wherein at least one of said constructs in
said pool comprises a nucleic acid sequence encoding an
inhibitor of said protease;

(iii) transforming said host cells of (i)
with said nucleic acid constructs of (ii); and

(iv) growing said transformed host cells
of (iii) under conditions that distinguish cells with
said selectable phenotype, thereby detecting the presence
of said protease inhibitor; and

(b) control sequences that are operably linked
to said coding sequence whereby said coding sequence can
be transcribed and translated in a host cell, and at
least one of said control sequences is heterologous to at
least said coding sequence.

22. A host cell stably transformed with a DNA 
construct according to claim 12. 



23. A host cell stably transformed with a DNA
construct according to claim 13.

24. The host cell of claim 23 further
transformed with a DNA construct comprising:

(a) a DNA coding sequence for a protein
capable of inhibiting the action of a given protease,
said protein identified by a method comprising

(i) providing a population of host cells
expressing a first nucleic acid sequence encoding a
protease and a second nucleic acid sequence encoding a
protein capable of conferring a selectable phenotype on
said host cells dependent on the activity of said
protease;

(ii) providing a pool of nucleic acid
constructs wherein at least one of said constructs in
said pool comprises a nucleic acid sequence encoding an
inhibitor of said protease;

(iii) transforming said host cells of (i)
with said nucleic acid constructs of (ii); and

(iv) growing said transformed host cells
of (iii) under conditions that distinguish cells with
said selectable phenotype, thereby detecting the presence
of said protease inhibitor; and

(b) control sequences that are operably linked
to said coding sequence whereby said coding sequence can
be transcribed and translated in a host cell, and at
least one of said control sequences is heterologous to at
least said coding sequence.

25. A host cell stably transformed with a DNA
construct according to claim 14.

26. The host cell of claim 25 further
transformed with a DNA construct comprising:

(a) a DNA coding sequence for a protein
capable of inhibiting the action of ZYMV 49 kDa protease,
said protein identified by a method comprising

(i) providing a population of host cells
expressing a first nucleic acid sequence encoding a
protease and a second nucleic acid sequence encoding a
protein capable of conferring a selectable phenotype on
said host cells dependent on the activity of said
protease;

(ii) providing a pool of nucleic acid
constructs wherein at least one of said constructs in
said pool comprises a nucleic acid sequence encoding an
inhibitor of said protease;

(iii) transforming said host cells of (i)
with said nucleic acid constructs of (ii); and

(iv) growing said transformed host cells
of (iii) under conditions that distinguish cells with
said selectable phenotype, thereby detecting the presence
of said protease inhibitor; and

(b) control sequences that are operably linked
to said coding sequence whereby said coding sequence can
be transcribed and translated in a host cell, and at
least one of said control sequences is heterologous to at
least said coding sequence.